15 research outputs found

    Comparative genomics of Burkholderia multivorans, a ubiquitous pathogen with a highly conserved genomic structure

    The natural environment serves as a reservoir of opportunistic pathogens. A well-established method for studying the epidemiology of such opportunists is multilocus sequence typing, which in many cases has defined strains predisposed to causing infection. Burkholderia multivorans is an important pathogen in people with cystic fibrosis (CF) and its epidemiology suggests that strains are acquired from non-human sources such as the natural environment. This raises the central question of whether the isolation source (CF or environment) or the multilocus sequence type (ST) of B. multivorans better predicts their genomic content and functionality. We identified four pairs of B. multivorans isolates, representing distinct STs and consisting of one CF and one environmental isolate each. All genomes were sequenced using the PacBio SMRT sequencing technology, which resulted in eight high-quality B. multivorans genome assemblies. The present study demonstrated that the genomic structure of the examined B. multivorans STs is highly conserved and that the B. multivorans genomic lineages are defined by their ST. Orthologous protein families were not uniformly distributed among chromosomes, with core orthologs being enriched on the primary chromosome and ST-specific orthologs being enriched on the second and third chromosomes. The ST-specific orthologs were enriched in genes involved in defense mechanisms and secondary metabolism, corroborating the strain-specificity of these virulence characteristics. Finally, the same B. multivorans genomic lineages occur in both CF and environmental samples and on different continents, demonstrating their ubiquity and evolutionary persistence.

    Phylogenomic study of Burkholderia glathei-like organisms, proposal of 13 novel Burkholderia species and emended descriptions of Burkholderia sordidicola, Burkholderia zhejiangensis, and Burkholderia grimmiae

    Partial gyrB gene sequence analysis of 17 isolates from human and environmental sources revealed 13 clusters of strains and identified them as Burkholderia glathei clade (BGC) bacteria. The taxonomic status of these clusters was examined by whole-genome sequence analysis, determination of the G+C content, whole-cell fatty acid analysis and biochemical characterization. The whole-genome sequence-based phylogeny was assessed using the Genome Blast Distance Phylogeny (GBDP) method and an extended multilocus sequence analysis (MLSA) approach. The results demonstrated that these 17 BGC isolates represented 13 novel Burkholderia species that could be distinguished by both genotypic and phenotypic characteristics. BGC strains exhibited a broad metabolic versatility and developed beneficial, symbiotic, and pathogenic interactions with different hosts. Our data also confirmed that there is no phylogenetic subdivision in the genus Burkholderia that distinguishes beneficial from pathogenic strains. We therefore propose to formally classify the 13 novel BGC Burkholderia species as Burkholderia arvi sp. nov. (type strain LMG 29317(T) = CCUG 68412(T)), Burkholderia hypogeia sp. nov. (type strain LMG 29322(T) = CCUG 68407(T)), Burkholderia ptereochthonis sp. nov. (type strain LMG 29326(T) = CCUG 68403(T)), Burkholderia glebae sp. nov. (type strain LMG 29325(T) = CCUG 68404(T)), Burkholderia pedi sp. nov. (type strain LMG 29323(T) = CCUG 68406(T)), Burkholderia arationis sp. nov. (type strain LMG 29324(T) = CCUG 68405(T)), Burkholderia fortuita sp. nov. (type strain LMG 29320(T) = CCUG 68409(T)), Burkholderia temeraria sp. nov. (type strain LMG 29319(T) = CCUG 68410(T)), Burkholderia calidae sp. nov. (type strain LMG 29321(T) = CCUG 68408(T)), Burkholderia concitans sp. nov. (type strain LMG 29315(T) = CCUG 68414(T)), Burkholderia turbans sp. nov. (type strain LMG 29316(T) = CCUG 68413(T)), Burkholderia catudaia sp. nov. (type strain LMG 29318(T) = CCUG 68411(T)) and Burkholderia peredens sp. nov. (type strain LMG 29314(T) = CCUG 68415(T)). Furthermore, we present emended descriptions of the species Burkholderia sordidicola, Burkholderia zhejiangensis and Burkholderia grimmiae. The GenBank/EMBL/DDBJ accession numbers for the 16S rRNA and gyrB gene sequences determined in this study are LT158612-LT158624 and LT158625-LT158641, respectively.

    Introducing SPeDE : high-throughput dereplication and accurate determination of microbial diversity from matrix-assisted laser desorption-ionization time of flight mass spectrometry data

    The isolation of microorganisms from microbial community samples often yields a large number of conspecific isolates. Increasing the diversity covered by an isolate collection entails the implementation of methods and protocols to minimize the number of redundant isolates. Matrix-assisted laser desorption-ionization time-of-flight (MALDI-TOF) mass spectrometry methods are ideally suited to this dereplication problem because of their low cost and high throughput. However, the available software tools are cumbersome and rely either on the prior development of reference databases or on global similarity analyses, which are inconvenient and offer low taxonomic resolution. We introduce SPeDE, a user-friendly spectral data analysis tool for the dereplication of MALDI-TOF mass spectra. Rather than relying on global similarity approaches to classify spectra, SPeDE determines the number of unique spectral features by a mix of global and local peak comparisons. This approach allows the identification of a set of nonredundant spectra linked to operational isolation units. We evaluated SPeDE on a data set of 5,228 spectra representing 167 bacterial strains belonging to 132 genera across six phyla and on a data set of 312 spectra of 78 strains measured before and after lyophilization and subculturing. SPeDE was able to dereplicate with high efficiency by identifying redundant spectra while retrieving reference spectra for all strains in a sample. SPeDE can identify distinguishing features between spectra, and its performance exceeds that of established methods in speed and precision. SPeDE is open source under the MIT license and is available from https://github.com/LM-UGent/SPeDE. IMPORTANCE: Estimation of the operational isolation units present in a MALDI-TOF mass spectral data set involves an essential dereplication step to identify redundant spectra in a rapid manner and without sacrificing biological resolution. We describe SPeDE, a new algorithm which facilitates culture-dependent clinical or environmental studies. SPeDE enables the rapid analysis and dereplication of isolates, a critical feature when long-term storage of cultures is limited or not feasible. We show that SPeDE can efficiently identify sets of similar spectra at the level of the species or strain, exceeding the taxonomic resolution of other methods. The high-throughput capacity, speed, and low cost of MALDI-TOF mass spectrometry and SPeDE dereplication over traditional gene marker-based sequencing approaches should facilitate adoption of the culturomics approach to bacterial isolation campaigns.
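
    The idea of dereplicating spectra by counting unique peaks can be illustrated with a minimal sketch. This is not SPeDE's actual algorithm: it is a simplified greedy stand-in that treats each spectrum as a list of peak m/z values and calls two spectra redundant when neither has a peak the other lacks. The function names, the 700 ppm tolerance, and the `min_unique` threshold are all assumptions chosen for illustration.

```python
def has_match(mz, peaks, ppm):
    """True if any peak in `peaks` lies within `ppm` parts per million of `mz`."""
    return any(abs(mz - p) <= mz * ppm / 1e6 for p in peaks)


def unique_peak_count(spectrum, other, ppm):
    """Number of peaks in `spectrum` that have no counterpart in `other`."""
    return sum(1 for mz in spectrum if not has_match(mz, other, ppm))


def dereplicate(spectra, ppm=700.0, min_unique=1):
    """Greedy dereplication (illustrative only, not the SPeDE method).

    `spectra` maps a spectrum name to a list of peak m/z values.
    Returns the list of reference names and a name -> reference mapping.
    """
    references = []
    assignment = {}
    for name, peaks in spectra.items():
        for ref in references:
            ref_peaks = spectra[ref]
            # Redundant only if neither spectrum has enough unique peaks.
            if (unique_peak_count(peaks, ref_peaks, ppm) < min_unique
                    and unique_peak_count(ref_peaks, peaks, ppm) < min_unique):
                assignment[name] = ref
                break
        else:
            # No existing reference covers this spectrum: keep it as a new one.
            references.append(name)
            assignment[name] = name
    return references, assignment


# Hypothetical peak lists (m/z values), invented for this example.
example = {
    "isolate_a": [3000.0, 5000.0, 7000.0],
    "isolate_b": [3000.5, 5000.5, 7001.0],
    "isolate_c": [3000.0, 5000.0, 9000.0],
}
refs, mapping = dereplicate(example)
```

    In this toy input, `isolate_b` differs from `isolate_a` only within the tolerance and is assigned to it, while `isolate_c` carries a peak with no counterpart and becomes its own reference.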

    Comparative genomics of Pandoraea, a genus enriched in xenobiotic biodegradation and metabolism

    Comparative analysis of partial gyrB, recA, and gltB gene sequences of 84 Pandoraea reference strains and field isolates revealed several clusters that included no taxonomic reference strains. The gyrB, recA, and gltB phylogenetic trees were used to select 27 strains for whole-genome sequence analysis and for a comparative genomics study that also included 41 publicly available Pandoraea genome sequences. The phylogenomic analyses included a Genome BLAST Distance Phylogeny approach to calculate pairwise digital DNA-DNA hybridization values and their confidence intervals, average nucleotide identity analyses using the OrthoANIu algorithm, and a whole-genome phylogeny reconstruction based on 107 single-copy core genes using bcgTree. These analyses, along with subsequent chemotaxonomic and traditional phenotypic analyses, revealed the presence of 17 novel Pandoraea species among the strains analyzed, and allowed the identification of several unclassified Pandoraea strains reported in the literature. The genus Pandoraea has an open pan genome that includes many orthogroups in the 'Xenobiotics biodegradation and metabolism' KEGG pathway, which likely explains the enrichment of these species in polluted soils and participation in the biodegradation of complex organic substances. We propose to formally classify the 17 novel Pandoraea species as P. anapnoica sp. nov. (type strain LMG 31117(T) = CCUG 73385(T)), P. anhela sp. nov. (type strain LMG 31108(T) = CCUG 73386(T)), P. aquatica sp. nov. (type strain LMG 31011(T) = CCUG 73384(T)), P. bronchicola sp. nov. (type strain LMG 20603(T) = ATCC BAA-110(T)), P. capi sp. nov. (type strain LMG 20602(T) = ATCC BAA-109(T)), P. captiosa sp. nov. (type strain LMG 31118(T) = CCUG 73387(T)), P. cepalis sp. nov. (type strain LMG 31106(T) = CCUG 39680(T)), P. commovens sp. nov. (type strain LMG 31010(T) = CCUG 73378(T)), P. communis sp. nov. (type strain LMG 31110(T) = CCUG 73383(T)), P. eparura sp. nov. (type strain LMG 31012(T) = CCUG 73380(T)), P. horticolens sp. nov. (type strain LMG 31112(T) = CCUG 73379(T)), P. iniqua sp. nov. (type strain LMG 31009(T) = CCUG 73377(T)), P. morbifera sp. nov. (type strain LMG 31116(T) = CCUG 73389(T)), P. nosoerga sp. nov. (type strain LMG 31109(T) = CCUG 73390(T)), P. pneumonica sp. nov. (type strain LMG 31114(T) = CCUG 73388(T)), P. soli sp. nov. (type strain LMG 31014(T) = CCUG 73382(T)), and P. terrigena sp. nov. (type strain LMG 31013(T) = CCUG 73381(T)).

    Phylogenomic analysis showing the relatedness of the genomes in terms of sequence divergence of the panorthologs.

    The maximum likelihood tree was inferred using the GTRGAMMA substitution model and is based on a concatenated nucleotide alignment of 4,503 CDS (4,457,847 positions). The percentages of replicate trees in which the associated taxa clustered together in the bootstrap analyses (1,000 replicates) are shown next to the branches. The scale bar represents the number of substitutions per site. The tree was rooted on the branch with the largest branch length.

    The frequency of orthologous versus non-orthologous CDS varies among chromosomes and COG categories.

    Bar plots show the number of orthologous and non-orthologous CDS per chromosome (χ²(2) = 213.4, p < 0.001) (a) and COG category (χ²(22) = 5101.2, p < 0.001) (c). Mosaic plots show the standardized residuals of the Pearson chi-square analysis for the number of orthologous and non-orthologous CDS per chromosome (b). Solid and dashed boundaries represent positive and negative residuals, respectively. Rectangles are colored only if the standardized residual is significant at p < 0.05 (outside ±1.96). COG categories: J, translation, ribosomal structure and biogenesis; K, transcription; L, replication, recombination and repair; B, chromatin structure and dynamics; D, cell cycle control, cell division, chromosome partitioning; V, defense mechanisms; T, signal transduction mechanisms; M, cell wall/membrane/envelope biogenesis; N, cell motility; W, extracellular structures; U, intracellular trafficking, secretion, and vesicular transport; O, posttranslational modification, protein turnover, chaperones; X, mobilome: prophages, transposons; C, energy production and conversion; G, carbohydrate transport and metabolism; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; R, general function prediction only; S, function unknown.
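
    As a rough illustration of the statistics named in this caption, the sketch below computes a Pearson chi-square statistic and adjusted standardized residuals for a contingency table, flagging cells outside ±1.96. The counts are hypothetical placeholders, not the study's data, and the function name is an assumption. A 2 × 3 table (orthologous versus non-orthologous CDS across three chromosomes) gives 2 degrees of freedom, matching the χ²(2) reported for panel (a).

```python
from math import sqrt


def pearson_chi_square(table):
    """Return (chi2, dof, adjusted standardized residuals) for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Adjusted residual: approximately standard normal under independence.
            denom = sqrt(expected
                         * (1 - row_totals[i] / n)
                         * (1 - col_totals[j] / n))
            res_row.append((observed - expected) / denom)
        residuals.append(res_row)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof, residuals


# Hypothetical counts: rows = orthologous / non-orthologous CDS,
# columns = chromosome 1, 2, 3. These numbers are invented for illustration.
counts = [
    [2800, 1900, 600],
    [300, 350, 250],
]
chi2, dof, residuals = pearson_chi_square(counts)

# Cells whose residual lies outside +/-1.96 would be colored in a mosaic plot.
significant = [[abs(r) > 1.96 for r in row] for row in residuals]
```

    A positive residual means a cell holds more CDS than expected under independence, and a negative residual means fewer.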